HW 03

Author

Amit Chawla

Initial Setup

if (!require("pacman")) 
  install.packages("pacman")
#> Loading required package: pacman
pacman::p_load(tidyverse,
               janitor,
               colorspace,
               broom,
               fs,
               scales,
               ggthemes,
               ggrepel,
               patchwork,
               ggimage,
               jpeg,
               glue,
               grid,
               forcats)

# set theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 14))

# set width of code output
options(width = 65)

# set figure parameters for knitr
knitr::opts_chunk$set(
  fig.width = 7, # 7" width
  fig.asp = 0.618, # the golden ratio
  fig.retina = 3, # dpi multiplier for displaying HTML output on retina
  fig.align = "center", # center align figures
  dpi = 300 # higher dpi, sharper image
)

1 - Du Bois challenge

# 1. Base data
du_bois_income <- read_csv("data/income.csv")
#> Rows: 7 Columns: 7
#> ── Column specification ─────────────────────────────────────────
#> Delimiter: ","
#> chr (1): Class
#> dbl (6): Average_Income, Rent, Food, Clothes, Tax, Other
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# 2. Pivot + cleanup with reversed category order
du_bois_long <- du_bois_income |>
  pivot_longer(cols = Rent:Other, names_to = "Category", values_to = "Expenditure") |>
  mutate(
    Category = fct_relevel(Category, "Rent", "Food", "Clothes", "Tax", "Other"),  # For legend order
    Class = fct_rev(fct_inorder(Class))
  ) |>
  arrange(Class, desc(Category)) |>  # Reverse category order within each class
  group_by(Class) |>
  mutate(
    CenterPos = cumsum(Expenditure) - 0.5 * Expenditure,  # Recalculate positions
    Label = paste0(Expenditure, "%"),
    LabelColor = ifelse(Category == "Rent", "white", "black")
  ) |>
  ungroup()

# 3. Colors (match reference)
colors <- c("Rent" = "black", "Food" = "purple", "Clothes" = "sienna1", "Tax" = "slategray1", "Other" = "snow2")

# 4. Background image
bg_path <- "images/du-bois-bg.jpg"
bg <- jpeg::readJPEG(bg_path)
bg_grob <- rasterGrob(bg, width = unit(1, "npc"), height = unit(1, "npc"), interpolate = TRUE)

# 5. Final Plot
ggplot(du_bois_long, aes(x = Class, y = Expenditure, fill = Category)) +
  annotation_custom(bg_grob, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  geom_col(width = 0.5, color = "white") +
  geom_text(aes(y = CenterPos, label = Label, color = LabelColor),
            size = 3, show.legend = FALSE, family = "mono") +
  scale_color_identity() +
  scale_fill_manual(values = colors, breaks = c("Rent", "Food", "Clothes", "Tax", "Other")) +
  scale_x_discrete(labels = c("$1,000    $1,125 \nAND OVER          ", "$750-1000   $880   ", "$500-750   $547   ", "$400-500   $433.82", "$300-400   $335.66", "$200-300   $249.45", "$100-200   $139.10")) +
  coord_flip(clip = "off") +
  labs(
    x = NULL, y = NULL,
    title = "INCOME AND EXPENDITURE OF 150 NEGRO FAMILIES IN ATLANTA, GA., U.S.A."
  ) +
  theme_minimal(base_family = "mono") +
  theme(
    legend.position = "top",
    plot.title = element_text(face = "bold", size = 11, hjust = 0.5),
    panel.grid = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_text(size = 7),
    text = element_text(size = 9),
    legend.title = element_blank(),
    legend.text = element_text(size = 9),
    legend.key.size = unit(0.3, "cm")
  ) +
  annotate(
    geom = "text",
    x = "$100-200",
    y = 0,
    label = "CLASS    ACTUAL AVERAGE                                                                                  \n\n\n",
    size = 2,
    color = "black",
    family = "mono",
    hjust = 0
  )
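
The `CenterPos` column in step 2 puts each label at the midpoint of its stacked segment: the running total up to and including the segment, minus half the segment's own size. A minimal sketch of that arithmetic follows. It uses made-up percentages for one class (the values and the name `expenditure` are assumptions for illustration, not data from `income.csv`).

```r
# Hypothetical percentages for a single class, listed in stacking order
# (the segment closest to zero comes first).
expenditure <- c(Other = 10, Tax = 5, Clothes = 20, Food = 40, Rent = 25)

# Upper edge of each stacked segment
upper <- cumsum(expenditure)

# Step back half a segment to land on its midpoint
center <- upper - 0.5 * expenditure

center
```

If manual positions are not needed, ggplot2 can also compute these midpoints itself through `position_stack(vjust = 0.5)` in `geom_text()`, which may be simpler to maintain.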

2 - COVID survey - interpret

Interpretation of the COVID-19 Vaccine Attitudes Visualization

This visualization summarizes how medical and nursing students across the U.S. feel about the COVID-19 vaccine. It is a grid of facets: each column is a statement about the vaccine, rated on a Likert scale (1 = Strongly Agree to 5 = Strongly Disagree), and each row breaks the responses down by a demographic factor such as age, gender, race, profession, or vaccination status. Each point marks a group's mean score, and the error bars run from the 10th to the 90th percentile, which shows how much opinions vary within that group. The first row, labeled “All,” shows the overall responses with no demographic split.
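
To make the point-and-error-bar encoding concrete, here is a small sketch of how a mean and a 10th-to-90th percentile range could be computed per group. The data frame `responses`, its values, and the helper `summarise_group()` are hypothetical and are not taken from the survey data. The sketch uses base R only, so it runs without extra packages.

```r
# Hypothetical Likert responses (1 = Strongly Agree, 5 = Strongly Disagree)
responses <- data.frame(
  group = rep(c("Medical", "Nursing"), each = 5),
  score = c(1, 2, 2, 3, 5, 1, 1, 2, 2, 3)
)

# Mean plus the 10th and 90th percentiles for one group's scores
summarise_group <- function(x) {
  c(
    mean = mean(x),
    low  = unname(quantile(x, 0.10)),
    high = unname(quantile(x, 0.90))
  )
}

# One row per group: the point (mean) and the error-bar ends (low, high)
likert_summary <- do.call(
  rbind,
  lapply(split(responses$score, responses$group), summarise_group)
)

likert_summary
```

A wide gap between `low` and `high` corresponds to a long error bar in the plot, meaning responses within that group are spread across the scale.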

Overall Observations

Starting with the “All” row, most students lean toward positive views of the vaccine. For statements like “I believe the vaccine is safe,” “Getting the vaccine will make me feel safer at work,” and “I will recommend the vaccine to others,” the mean scores sit around 1.5 to 2, so on average students agree, either strongly or somewhat. That seems right to me, since these are future healthcare professionals who probably trust the science. For “I am concerned about the safety and side effects of the vaccine,” though, the mean rises to about 3 (neutral), and the error bars run from roughly 1.5 to 4.5. So even though students trust the vaccine overall, many of them are still worried about side effects, which I understand given how quickly it was developed.

Example 1: Asian Respondents’ Mixed Feelings

One thing that caught my eye was the Asian students in the “Race” category. For “I believe the vaccine is safe,” the error bars are very wide, running from 1 to 5. That means some students strongly agree it is safe while others strongly disagree, which is more variation within one group than I expected. For “I will recommend the vaccine to others,” however, the mean is around 2 and the error bars are much tighter, roughly 1 to 3. That seems inconsistent to me: if opinions on safety are this spread out, I would expect the same uncertainty about recommending it. One possible explanation is that some students feel a duty to promote the vaccine during a pandemic even if they are not fully convinced themselves.

Example 2: Nursing vs. Medical Students

Looking at the “Profession” row, nursing and medical students don’t line up as closely as I thought they would. For statements like “I trust the information I have received,” nursing students have tight error bars, roughly 1 to 2.5, so their answers are fairly consistent. Medical students have wider ones, roughly 1 to 4, showing more mixed opinions. I expected both groups to think alike, since both are in healthcare and learning the same science. This makes me wonder whether medical students are more skeptical because they spend more time with the research, while nursing students may rely more on what they are taught, although the plot alone can’t confirm that. It’s not what I expected!

Example 3: Vaccinated Students Still Worried

In the “Had COVID vaccine” row, students who answered “Yes” have means around 1 to 1.5 for positive statements like “I believe the vaccine is safe,” which makes sense: if you got the vaccine, you probably trust it. But for “I am concerned about the safety and side effects of the vaccine,” the mean is still around 3, with error bars from about 1.5 to 4.5. That’s interesting, because I assumed vaccinated students would be less worried. It shows that even people who took the vaccine are not fully at ease about the risks, which fits the idea that some got it for school or work but still have doubts about a new vaccine.

Wrapping Up

This plot shows that medical and nursing students mostly trust the COVID-19 vaccine, but not without reservations: concern about side effects shows up across the board. The differences between groups, such as race or profession, add layers I didn’t expect and make it clear that even future doctors and nurses don’t all see the vaccine the same way. It’s fascinating how complicated their views are!

3 - COVID survey - reconstruct

4 - COVID survey - re-reconstruct

5 - COVID survey - another view